Abstract—In the present paper, we propose a broadcast ARQ
protocol based on the concept of index coding. In the proposed 
scenario, a server wishes to transmit a finite sequence of packets 
to multiple receivers via a broadcast channel with packet erasures 
until all of the receivers successfully receive all of the packets. 
In the retransmission phase, the server produces a coded packet 
as a retransmitted packet based on the side-information sent 
from the receivers via feedback channels. A notable feature 
of the proposed protocol is that the decoding process at the 
receiver side has low decoding complexity because only a small 
number of addition operations are needed in order to recover an 
intended packet. This feature may be preferable for reducing the 
power consumption of receivers. The throughput performance 
of the proposed protocol is close to that of the ideal FEC 
throughput performance when the erasure probability is less 
than 0.1. This implies that the proposed protocol provides almost 
optimal throughput performance in such a regime. 

I. Introduction 

Recent strong demand for handling data sets of extremely
large size has produced a number of situations in which 
a server must send a huge file to multiple clients over an 
unreliable channel. A simple example is the distributed backup 
of a mission-critical file system. In order to avoid a devastating 
incident due to a natural disaster, distributed backups at distant 
locations are of critical importance. It is common to use 
multicast protocols to reduce the bottleneck traffic at the server 
for such applications. Since IP multicast is based on the User
Datagram Protocol (UDP), which does not guarantee delivery over
an IP network, reliable multicast protocols have been proposed
in order to ensure reliable content distribution. Another
possible scenario is content distribution, such
as HD video streams in cellular wireless systems. Suppose 
that a base station (i.e., server) wishes to share a large file 
or bitstream with multiple mobile terminals. In such a case, 
careful design of a protocol is necessary in order to achieve 
sufficient throughput while maintaining a certain degree of 
reliability. 

These situations can be abstracted as a problem of sharing 
identical content with multiple receivers over an unreliable 
broadcast channel. A broadcast channel is a channel over 
which receivers can listen to what a server has sent over 
common media: wireless signals on a specific band or a 
multicast network. In order to achieve high reliability (i.e., low 
error probability) of data and high throughput, several coding 
techniques and protocols have been proposed for broadcast
channels.

A prominent example is Automatic Repeat reQuest (ARQ)
protocols, which are often used for reliable content distribution
over a broadcast channel. In an ARQ protocol, a receiver sends
a request for retransmission to the server if it receives
a broken packet or detects a packet loss. The server resends
the corresponding packets when it receives a request from a
receiver. Simple protocols for single-to-single communication,
such as a Go-Back-N protocol and a selective-repeat protocol,
can also be used for broadcast channels. However, direct 
application of such protocols to a broadcast channel often 
causes significant degradation of the throughput performance 
because such protocols do not consider packet losses in distinct 
receivers. 

On the other hand, a specialized ARQ protocol using a Forward
Error Correcting (FEC) code provides excellent throughput
performance over broadcast channels. Metzner proposed
an ARQ protocol based on Reed-Solomon codes for broadcast
channels. His protocol yields much higher throughputs
than those of single-to-single ARQ protocols applied to a
broadcast channel. Chandran and Lin presented a Selective-Repeat
ARQ protocol for broadcast channels. Sakakibara
and Kasahara introduced the concept of hybrid ARQ based on
GMD decoding to Metzner’s protocol and reported improved
throughput performance. Recently, growing demand for
real-time multicast, such as video streaming, has stimulated
research on FEC-based protocols based on Reed-Solomon
codes or sparse graph codes. For example, digital fountain
codes provide high throughput performance without
introducing complex encoding/decoding operations at the server
and receiver sides.

The concept of index coding proposed by Birk and Kol
[9] has had a significant impact on research into coding for
broadcast channels. Index coding is a coding technique to
achieve better bandwidth efficiency of a broadcast channel.
Index coding can use side-information to improve the throughput
of the protocol. The concept of index coding is described as 
follows. Let us assume a broadcast channel with one server 
and multiple receivers. The server has a packet sequence, and 
each receiver knows a part of the packet sequence, which is 
referred to as side-information. A receiver is assumed to have 
its own need for packets, i.e., each receiver wants to know 
a part of the packet sequence. The server perfectly knows 
the desired part of the packet sequence and side-information 
for each receiver. Based on such knowledge, the server can 
produce coded packets by combining the original packets, 
which are sent to the channel. Appropriately coded packets 
using index coding can satisfy all of the demands of receivers 
and reduce the number of packets to be sent. Theoretical
aspects of linear index coding are discussed by Bar-Yossef et
al., who showed that the minrank of the side-information
graph gives the shortest code length of linear index coding.
A number of theoretical studies on index coding have been 
conducted. For example, the relationship between index coding 
and network coding has been discussed. From a practical
point of view, finding appropriate combinations of packets is
the most difficult part of the index coding process. Birk and
Kol [9] proposed a greedy type algorithm for searching large
cliques in a given side-information graph. Several efficient
algorithms based on a graph algorithm or SAT solver have also
been discussed.

In the present paper, for reliable content distribution over
broadcast channels with symbol (i.e., packet) erasure, we 
propose an ARQ protocol, referred to as index ARQ protocol, 
based on the concept of index coding. The goal of the proposed 
scenario is to share a packet sequence with all receivers 
participating in the protocol. Due to packet erasure, some of the
received packets are missing at certain receivers. The proposed
protocol does not rely on conventional FEC to compensate for
such packet losses. However, unlike conventional ARQ protocols,
the proposed protocol performs an encoding procedure
similar to index coding to produce a retransmitted packet. The
states of the receivers are fed back to the server and such state 
information is used to make an appropriate coded packet. A 
coded packet is constructed by superimposing several packets 
over a finite field. This coding process resembles index coding 
based on the greedy clique algorithm proposed by Birk and
Kol [9]. Successfully received packets at each receiver act as
side-information, and these packets can be used to improve the
bandwidth efficiency of the system. In the proposed protocol,
one packet may compensate for several packet losses at several
receivers. Another notable feature of the proposed protocol is
that the decoding process at the receiver side has low
complexity because only a small number of addition operations are
needed in order to recover an intended packet. This feature may
be preferable for reducing power consumption in receivers.

II. Preliminaries 

In this section, we introduce the notation and definitions 
used throughout the present paper. 

A. Broadcast channel 

Figure 1 represents the broadcast channel assumed in the
present paper. A server $S$ wishes to share a packet sequence
$p = (p_1, p_2, \ldots, p_n) \in \mathbb{F}_q^n$ with $m$ receivers $R_1, \ldots, R_m$. The
symbol $\mathbb{F}_q$ denotes the finite field with $q$ elements, where $q$ is a
prime power. An element $p_i$ of $p$ is said to be a packet. Each
receiver wishes to obtain all of the packets in $p$. In other words,
the goal of communication over this channel is to distribute $p$
to all of the receivers.

For simplicity, we assume that the server $S$ can send a
coded packet (or an uncoded packet) $x^t \in \mathbb{F}_q$ to the channel at
the discrete time instant $t \in \mathbb{N}$, where $\mathbb{N}$ is the set of positive
integers. The time interval between two consecutive packet
transmissions is assumed to be sufficient to accommodate a
packet. In other words, two consecutive transmitted packets
never collide with each other. The receiver $R_i$ ($i \in [1, m]$)
receives the packet $y_i^t = x^t$ with probability $1 - e$
($0 < e < 1$); otherwise, $y_i^t = E$ with probability $e$, where the
symbol $E$ represents the erasure symbol. An erasure can be
considered as the occurrence of a packet loss on the channel.
The occurrences of erasures (i.e., packet losses) are assumed to
be independent (i.e., the channel is memoryless). The notation
$[a, b]$ represents the set of consecutive integers from $a$ to $b$.

In the initial state of the protocol, none of the receivers have
knowledge of the contents of the packet sequence $p$. The server
continues to send a sequence of coded packets $x^1, x^2, x^3, \ldots$
until all of the receivers successfully obtain all of the packets
in $p$. At time $t$, the packet indices corresponding to the packets
that were successfully received by $R_i$ are denoted by
$\mathcal{K}_i^t \subseteq [1, n]$ (known indices). The indices of unknown packets are
represented as $\mathcal{W}_i^t \subseteq [1, n]$ (wanted indices). Based on these
definitions, $\mathcal{K}_i^t \cup \mathcal{W}_i^t = [1, n]$ holds for any $i \in [1, m]$ and for
any $t \in \mathbb{N}$.

For any time $t$, the information on $\mathcal{W}_i^t$ (or equivalently
$\mathcal{K}_i^t$) is fed back to the server via a noiseless feedback channel
before $x^{t+1}$ is sent to the channel. In other words, the server
S always has perfect knowledge of the known packets for 
all receivers. Here, we assume that the size of a packet is 
much larger than the index information communicated via the 
noiseless feedback channel. This means that the capacity of the 
feedback channel can be much smaller than that of the forward 
broadcast channel. For example, in a multicast scenario, the 
noiseless feedback channel can be implemented using reliable 
TCP connections. 
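
The channel model of this subsection can be sketched in a few lines of Python. This is only an illustrative sketch: the names `broadcast` and `ERASURE` and the use of `None` as the erasure symbol $E$ are assumptions made here, not part of the original paper.

```python
import random

ERASURE = None  # assumption: None stands in for the erasure symbol E


def broadcast(packet, num_receivers, erasure_prob, rng):
    """Deliver one packet to every receiver over independent erasure channels.

    Each receiver independently sees the packet with probability
    1 - erasure_prob and the erasure symbol otherwise (memoryless channel).
    """
    return [ERASURE if rng.random() < erasure_prob else packet
            for _ in range(num_receivers)]


rng = random.Random(0)
received = broadcast(b"\x2a", num_receivers=5, erasure_prob=0.1, rng=rng)
print(received)
```

Passing the random generator explicitly keeps the sketch reproducible, which is convenient when simulated throughput values are compared across protocols.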


B. State matrix 

As described in the previous section, the server S perfectly 
knows the states of the receivers. The state matrix is a matrix 
representation of the knowledge of the server. The definition 
is given as follows. 

Definition 1 (State matrix): An $m \times n$ binary matrix $C = \{C_{ij}\}$
is said to be a state matrix if the $(i, j)$ element $C_{ij}$
($i \in [1, m]$, $j \in [1, n]$) is given by

$$ C_{ij} = \begin{cases} 1, & R_i \text{ knows the content of packet } p_j, \\ 0, & \text{otherwise.} \end{cases} \tag{1} $$























If the context requires the time instant (or time index) to be
specified, we will use the notation $C^t$, which clarifies the
dependency on the time index $t$. Otherwise, we omit the time
index in order to simplify the notation. A row of a state
matrix corresponds to a receiver, and a column corresponds to
a packet. For example, assume that the server $S$ has a packet
sequence $p = (p_1, p_2, p_3)$ and wants to distribute the sequence
to two receivers $R_1, R_2$. At a certain time, the state matrix is
given by

$$ C = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}, \tag{2} $$

which represents the state of the entire system. In this case, the
receiver $R_1$ knows packet $p_3$, and receiver $R_2$ knows packets
$p_1$ and $p_2$.
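
As a small illustration, the state matrix of the two-receiver example can be built from the known-index sets of the receivers. The helper name `state_matrix` and the list-of-sets representation are assumptions made for this sketch.

```python
def state_matrix(known, n):
    """Build the m x n state matrix from per-receiver known-index sets.

    known[i] holds the 1-based packet indices that receiver R_{i+1} knows.
    """
    return [[1 if j in k else 0 for j in range(1, n + 1)] for k in known]


# R_1 knows p_3; R_2 knows p_1 and p_2.
C = state_matrix([{3}, {1, 2}], n=3)

# The wanted indices are the positions of the zeros in each row.
wanted = [{j + 1 for j, c in enumerate(row) if c == 0} for row in C]
print(C, wanted)
```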


C. Clique matrix 

Let $I = (i_1, i_2, \ldots, i_\ell) \subseteq [1, n]$ be an index sequence
(i.e., ordered set) satisfying $i_1 < i_2 < \cdots < i_\ell$, where
$\ell$ ($\le n$) is a positive integer. Assume that a state matrix
$C = (c_1, c_2, \ldots, c_n)$ ($c_i \in \{0, 1\}^m$) is given, where $c_i$
represents the $i$th column vector of $C$. The submatrix of $C$
indexed by $I$, which is denoted by $C_I$, is defined as

$$ C_I = (c_{i_1}, c_{i_2}, \ldots, c_{i_\ell}). \tag{3} $$


The following definition describes clique matrices that play 
an important role in encoding and decoding processes of the 
proposed protocol. 

Definition 2 (Clique matrix): If an $s \times r$ binary matrix $A$
satisfies the following two conditions: (1) every row of $A$ has
a weight greater than or equal to $r - 1$; (2) $A$ does not contain
a column with column weight $s$, then the matrix $A$ is said to
be a clique matrix.


For a given state matrix $C$, if an index sequence $I$ provides a
clique matrix $C_I$, then $I$ is called a set of clique indices of $C$.
The term “clique matrix” comes from the clique-based index
coding method presented by Birk and Kol [9]. A state matrix
can be seen as the adjacency matrix of a side-information
graph. Under such an interpretation, a clique matrix
corresponds to a clique in a given side-information graph. Next,
we present an example. Assume that the system has the state
represented by (2). The submatrix indexed by $I = (1, 3)$,

$$ C_I = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \tag{4} $$

is a clique matrix. Therefore, $I = (1, 3)$ is a set of clique
indices in this case.
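
The two conditions of Definition 2 translate directly into a short check. The sketch below uses hypothetical helper names (`submatrix`, `is_clique_matrix`) and represents a matrix as a list of rows; it reproduces the example above and also shows an index sequence that fails the test.

```python
def submatrix(C, indices):
    """Columns of C selected by 1-based indices, kept in the given order."""
    return [[row[j - 1] for j in indices] for row in C]


def is_clique_matrix(A):
    """Check the two conditions of Definition 2 for a binary matrix A."""
    s = len(A)                      # number of rows (receivers)
    r = len(A[0]) if s else 0       # number of columns (packets)
    # (1) every row has weight at least r - 1
    rows_ok = all(sum(row) >= r - 1 for row in A)
    # (2) no column has weight s (no all-ones column)
    no_full_column = all(sum(row[j] for row in A) < s for j in range(r))
    return rows_ok and no_full_column


C = [[0, 0, 1], [1, 1, 0]]
print(is_clique_matrix(submatrix(C, (1, 3))))  # True
print(is_clique_matrix(submatrix(C, (1, 2))))  # False: row 1 has weight 0
```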


III. Index ARQ protocol 
A. Encoding process of index ARQ protocol 

We assume that the server S can send a coded packet to the 
channel at any time index until all of the receivers successfully 
obtain the entire packet sequence. The encoding process of the 
index ARQ at the server side is summarized as Algorithm 1.

Algorithm 1 Encoding process at the server side
1: $t := 0$.
2: $C^t := 0^{m \times n}$. ($0^{m \times n}$ represents the $m \times n$ zero matrix)
3: while $C^t$ contains a zero element do
4:   $I^t := f^t(C^t)$. (5)
5:   A coded packet is constructed as $x^t := \sum_{j \in I^t} p_j$. (6)
6:   Send $x^t$ to the broadcast channel.
7:   $t := t + 1$.
8:   $C^t$ is obtained based on the feedback information.
9: end while

The key steps of the encoding process are (5) and (6) in Algorithm
1. We here focus on these steps. The function
$f^t : \mathbb{F}_2^{m \times n} \to 2^{[1, n]}$ finds a set of clique indices from a given
state matrix. In other words, $C_I$ is a clique matrix in $C$, where
$I = f^t(C)$. The function $f^t$ is referred to as an index generator.
The details of an implementation of an index generator are discussed in
Subsection III-C. The encoding process at the server side
continues until the state matrix $C$ has no elements with the
value zero.

According to (6), a coded packet $x^t \in \mathbb{F}_q$ is encoded
by adding the packets having indices in $I^t$. Several packets are
superimposed over $\mathbb{F}_q$ to produce a coded packet. This coded
packet can thus satisfy the demands from several receivers
simultaneously, i.e., a coded packet can compensate for multiple
unknown packets in several receivers. This provides the
advantage of the proposed protocol in terms of throughput and
bandwidth efficiency.
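
The encoding rule (6) can be sketched as follows. The sketch assumes a field of characteristic two and represents each packet as a byte string, so that addition becomes a bytewise exclusive OR; the function names `xor_bytes` and `encode` are hypothetical, and all packets are assumed to have equal length.

```python
from functools import reduce


def xor_bytes(a, b):
    """Add two equal-length packets over a field of characteristic two."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(packets, clique_indices):
    """Superimpose the packets whose 1-based indices are in clique_indices."""
    return reduce(xor_bytes, (packets[j - 1] for j in clique_indices))


packets = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
x = encode(packets, (2, 3))   # coded packet p_2 + p_3
print(x.hex())
```

When the clique contains a single index, as in the first phase of the protocol, `encode` simply returns the original packet.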

B. Decoding process of index ARQ protocol 

The receiver $R_i$ receives symbol $y_i^t$ from the channel. We
assume that each receiver knows $I^t$ via the header information
attached to a transmitted coded packet. The decoding process at
time index $t$ in the receiver $R_i$ is summarized in Algorithm 2.

Algorithm 2 Decoding process at the receiver $R_i$
1: if $y_i^t = E$ then
2:   Quit the decoding process.
3: end if
4: if $I^t \cap \mathcal{W}_i^t = \emptyset$ then
5:   Quit the decoding process.
6: else
7:   Let $k$ be the unique index in $I^t \cap \mathcal{W}_i^t$.
8: end if
   $p_k := y_i^t - \sum_{j \in I^t \setminus \{k\}} p_j$. (7)
9: $\mathcal{W}_i^{t+1} := \mathcal{W}_i^t \setminus \{k\}$.
10: Send $\mathcal{W}_i^{t+1}$ to the server via the feedback channel.

The packet $p_k$ is an unknown packet before decoding because
$k \in \mathcal{W}_i^t$. Note that $R_i$ knows the packet $p_j$ for any $j \in I^t \setminus \{k\}$
due to the definition of the clique matrix and clique indices.
The reconstruction rule (7) is an immediate consequence of
the encoding rule (6). As an example, encoding and decoding
processes are depicted in Fig. 2.
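
A minimal sketch of the decoding rule (7) at one receiver is given below. As an assumption of this sketch, packets are byte strings over a field of characteristic two, so subtraction coincides with bytewise exclusive OR; the names `decode` and `xor_bytes` and the dictionary of known packets are illustrative only.

```python
def xor_bytes(a, b):
    """Add (equivalently, subtract) two packets in characteristic two."""
    return bytes(x ^ y for x, y in zip(a, b))


def decode(y, clique_indices, known):
    """Return (k, p_k) if a wanted packet is recovered, otherwise None.

    y: received coded packet, or None for an erasure.
    clique_indices: the 1-based indices I^t from the packet header.
    known: dict mapping 1-based index -> packet already held by the receiver.
    """
    if y is None:
        return None                      # erasure: nothing to decode
    wanted = [j for j in clique_indices if j not in known]
    if not wanted:
        return None                      # no wanted packet in this coded packet
    (k,) = wanted                        # clique property: exactly one remains
    p_k = y
    for j in clique_indices:
        if j != k:
            p_k = xor_bytes(p_k, known[j])   # strip the known packets
    return k, p_k


# R_1 knows p_3 and receives x = p_2 + p_3.
result = decode(b"\xef\x20", (2, 3), {3: b"\xff\x00"})
print(result)
```

The receiver would then remove `k` from its wanted set and report the updated set over the feedback channel, as in steps 9 and 10 of Algorithm 2.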


















Fig. 2. Encoding and decoding processes of the index ARQ protocol. The
server broadcasts a coded packet $x^t$. The receiver $R_1$ obtains $y_1^t$ and attempts
to retrieve the packet $p_2$ using the side-information. In a similar manner, the
receiver $R_2$ can recover intended packets.


C. Index generator 

As described in the previous subsections, a coded packet 
consists of several original packets. In order to improve the 
bandwidth efficiency, we need to increase the number of 
superimposed packets in an encoding process. This means that 
an index generator that is able to find a larger clique matrix 
from a state matrix is a preferable choice. In this subsection, 
we will present such an index generator. 


In the present paper, we use the following strategy to design
an index generator. The first phase of the protocol is defined as
a sequence of time indices $(1, 2, \ldots, n)$. The index generator
outputs $\{t\}$ if $t$ is in the first phase, i.e., $t \in [1, n]$. In other
words, the original packets $p_1, p_2, \ldots, p_n$ are sent directly to
the channel in the first phase. The second phase of the protocol
($t > n$) can be considered to be a retransmission phase. In
the second phase, packets that were not received due to packet
losses are gradually compensated for as the decoding process
proceeds. The index generator used in the second phase is based
on a greedy algorithm to find a large clique matrix. The problem of finding
the largest (in terms of the number of columns) clique matrix
is closely related to the problem of finding the largest clique
in a given undirected graph (maximum clique problem). As in
the case of the maximum clique problem, we cannot expect the
existence of efficient algorithms for this problem. In the present
paper, a heuristic greedy algorithm for finding a clique matrix
is used in the second phase. In summary, our index generator
has the following form:

$$ f^t(C) = \begin{cases} \{t\}, & t \in [1, n], \\ F(C), & \text{otherwise,} \end{cases} \tag{8} $$

where the function $F$ represents a greedy process to compute
clique indices.


The greedy search algorithm presented below is a randomized
greedy algorithm for finding a set of clique indices. This
algorithm is used to implement the function $F(C)$.

Algorithm 3 Greedy search algorithm
1: $S := \{c_1, c_2, \ldots, c_n\}$. ($C = (c_1, c_2, \ldots, c_n)$)
2: $A := ()$.
3: $I := \emptyset$.
4: while $S \neq \emptyset$ do
5:   Select a vector $c_k$ from $S$ uniformly at random.
6:   if $c_k$ is not the all-ones vector then
7:     $A' := (A, c_k)$.
8:     if $A'$ is a clique matrix then
9:       $A := A'$.
10:      $I := I \cup \{k\}$.
11:    end if
12:  end if
13:  $S := S \setminus \{c_k\}$.
14: end while
15: Output $I$.

The concept of the algorithm is simple. In lines 5 to 9 of
Algorithm 3, a column $c_k$ of the state matrix is randomly
chosen and is appended to the current candidate clique
matrix $A$ if $(A, c_k)$ does not violate the condition of the clique
matrix. This process continues until all of the columns in $C$
are tested. The overall time complexity of this greedy search is
$O(n^2)$ when $n = m$. The randomness in the column selection
is incorporated because it provides robust system performance
with regard to the delay of feedback information.

Next, we consider a simple example. Assume that we have
the state matrix $C$, as in (2), and that the order of the random
column choice is second column → first column → third
column. In such a case, the second column is first accepted
and the first column is rejected because it forms a non-clique
matrix. Finally, the third column is accepted, and the algorithm
outputs $I = (2, 3)$.
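
The greedy search can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the function name `greedy_clique_indices`, the optional `order` argument (used here only to reproduce the worked example deterministically), and the bookkeeping array `has_zero` are assumptions. The sketch relies on the observation that a matrix without all-ones columns is a clique matrix exactly when every row contains at most one zero, so each candidate column can be tested in $O(m)$ time.

```python
import random


def greedy_clique_indices(C, order=None, rng=random):
    """Randomized greedy search for a set of clique indices of C.

    C is an m x n binary matrix given as a list of rows. Indices are 1-based.
    If order is None, the columns are visited in a uniformly random order,
    which is equivalent to drawing from S without replacement.
    """
    m, n = len(C), len(C[0])
    if order is None:
        order = list(range(1, n + 1))
        rng.shuffle(order)
    has_zero = [False] * m      # row i already holds its single allowed zero
    clique = []
    for k in order:
        col = [C[i][k - 1] for i in range(m)]
        if all(col):
            continue            # all-ones column: every receiver knows p_k
        if any(col[i] == 0 and has_zero[i] for i in range(m)):
            continue            # some row would get weight below r - 1
        for i in range(m):
            if col[i] == 0:
                has_zero[i] = True
        clique.append(k)
    return sorted(clique)


C = [[0, 0, 1], [1, 1, 0]]
print(greedy_clique_indices(C, order=[2, 1, 3]))  # [2, 3]
```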

Although this algorithm depends on a simple greedy strategy,
the greedy algorithm has been empirically observed to
produce clique indices near the optimal (i.e., largest) size.

IV. Computer experiments 
A. Details of experiments 

In this subsection, we describe the details of computer 
experiments for evaluating the throughput performance of the 
proposed protocol. We assume the broadcast channel shown in
Fig. 1. As benchmarks, we evaluate the performance not only
of the proposed protocol but also of the Selective-Repeat (SR) 
protocol and the Metzner protocol. 

The SR protocol may be the simplest ARQ protocol for
compensating for the packet loss under this channel model. The
details of the SR protocol are as follows. As in the proposed
protocol, the server transmits the original packet $p_t$ when
$t = 1, 2, \ldots, n$ (phase 1). At every time interval, each receiver
reports its demands (i.e., state of the unreceived packets) to
the server via the reliable feedback channel. In phase 2 of this
protocol, the packet with the smallest index among all of the
requested packet indices is sent to the channel at time index
$t > n$. When the server receives ACK from all of the receivers,
the server terminates the transmission process. The advantage
of the SR protocol is its simplicity. No special operations are
required for encoding and decoding. However, the SR protocol
cannot provide a coding advantage to improve the throughput.

The Metzner protocol is an FEC-based ARQ protocol for
broadcast channels that is based on the erasure correcting
capability of Reed-Solomon codes. In the Metzner protocol,
Reed-Solomon coded packets are sent to the channel, and a
receiver that obtains $n$ packets from the channel can execute
an erasure correcting process (i.e., solving simultaneous linear
equations over $\mathbb{F}_q$) to recover the packet sequence that the
server possesses. The primary benefit of this protocol is the
near-optimal bandwidth efficiency it provides. A drawback
of the protocol is that every receiver needs to solve an
erasure correcting problem based on a Reed-Solomon code, which
requires a certain computational power at the receiver side.

Fig. 3. Relationship between throughput and erasure probability ($e$: $0 <
e < 0.1$, number of receivers $m = 100$, number of packets $n = 1000$).

In the present paper, we adopt throughput as the main
performance measure. The throughput of a protocol is directly
related to the bandwidth efficiency of the protocol. Let $N$ be
the total number of transmitted packets from the server until
the protocol terminates (i.e., all of the receivers obtain the
entire packet sequence). Note that this number $N$ is a random
variable depending on the randomness of the erasure channels.
The throughput $r$ is defined as $r = E[n/N]$, where $E[\cdot]$ denotes
expectation, which represents the average amount of information
per transmitted packet. In the
following subsections, the throughput $r$ is estimated through
computer simulations of these protocols.
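
To illustrate how the throughput $r = E[n/N]$ can be estimated by simulation, the following sketch implements the SR protocol described above under the memoryless erasure model. The function names, the seed handling, and the small parameter values are assumptions made for this sketch; they do not reproduce the experimental setup of the paper.

```python
import random


def simulate_sr(m, n, erasure_prob, rng):
    """Return N, the number of transmissions until all m receivers hold all n packets."""
    wanted = [set(range(n)) for _ in range(m)]   # 0-based wanted indices
    transmissions = 0

    def send(j):
        nonlocal transmissions
        transmissions += 1
        for w in wanted:
            # Only receivers that still want p_j are affected by this packet.
            if j in w and rng.random() >= erasure_prob:
                w.discard(j)

    for j in range(n):                           # phase 1: send every packet once
        send(j)
    while any(wanted):                           # phase 2: smallest requested index
        send(min(min(w) for w in wanted if w))
    return transmissions


def estimate_throughput(m, n, erasure_prob, trials, seed=0):
    """Monte Carlo estimate of r = E[n / N] for the SR protocol."""
    rng = random.Random(seed)
    return sum(n / simulate_sr(m, n, erasure_prob, rng)
               for _ in range(trials)) / trials


print(estimate_throughput(m=10, n=50, erasure_prob=0.05, trials=20))
```

Without erasures every packet is delivered in phase 1, so $N = n$ and the estimate equals one; any positive erasure probability pushes the estimate below one.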

The capacity of a single erasure channel with erasure
probability $e$ is given by $1 - e$. The coding theorem proved
by Shannon guarantees the existence of sufficiently long
FEC codes with coding rates below $1 - e$ that achieve arbitrarily
small error probabilities. Suppose that a server uses such an
FEC code with a coding rate close to $1 - e$. Although such a
system suffers from large latency due to long code length, a
throughput close to $1 - e$ can be achieved. In the following
discussion, we use such a system as a benchmark for the
throughput performance. The upper bound of the throughput
$r = 1 - e$ is referred to hereinafter as the ideal FEC bound.

B. Results of computer experiments 

Figure 3 shows the relationship between throughput and
erasure probability $e$ ($0 < e < 0.1$). In these experiments,
100 trials were conducted for each point on the curves. The
numbers of receivers and packets are assumed to be $m = 100$
and $n = 1000$, respectively. The results for three protocols,
namely the proposed protocol (labeled by index-ARQ), the
selective repeat (SR) protocol, and the Metzner protocol, are
included in Fig. 3.
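
An experiment of this kind can be sketched for the index ARQ protocol by tracking only the wanted sets of the receivers, since the packet contents do not affect the number of transmissions. The sketch below is an assumption-laden illustration and not the authors' simulator: the function names are hypothetical, feedback is taken to be instantaneous, and small values of $m$ and $n$ are used so that it runs quickly.

```python
import random


def greedy_clique(wanted, n, rng):
    """Randomized greedy choice of packet indices for one coded packet.

    A packet index is accepted only if none of the receivers wanting it
    already wants another accepted index, so each receiver misses at most
    one packet of the combination (the clique matrix condition).
    """
    order = list(range(n))
    rng.shuffle(order)
    busy = [False] * len(wanted)
    chosen = []
    for j in order:
        needers = [i for i, w in enumerate(wanted) if j in w]
        if needers and not any(busy[i] for i in needers):
            for i in needers:
                busy[i] = True
            chosen.append(j)
    return chosen


def simulate_index_arq(m, n, erasure_prob, rng):
    """Return N, the number of transmissions of the index ARQ protocol."""
    wanted = [set(range(n)) for _ in range(m)]
    transmissions = 0

    def send(indices):
        nonlocal transmissions
        transmissions += 1
        for w in wanted:
            hit = w.intersection(indices)        # at most one wanted index
            if hit and rng.random() >= erasure_prob:
                w -= hit                         # receiver decodes that packet

    for j in range(n):                           # phase 1: uncoded packets
        send([j])
    while any(wanted):                           # phase 2: coded retransmissions
        send(greedy_clique(wanted, n, rng))
    return transmissions


rng = random.Random(0)
n, trials = 100, 20
r_hat = sum(n / simulate_index_arq(10, n, 0.05, rng) for _ in range(trials)) / trials
print(r_hat)
```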

The Metzner protocol is confirmed to achieve the best
throughput performance among these three protocols at all
erasure probabilities. The throughput of the Metzner protocol
is quite close to the ideal FEC bound $r = 1 - e$. This
fact indicates that the Metzner protocol provides excellent
bandwidth efficiency that is close to optimal. On the other
hand, the SR protocol offers poor throughput performance
compared with the Metzner protocol. For example, the Metzner
protocol provides $r = 0.877$, whereas the SR protocol yields
$r = 0.365$ at $e = 0.1$. This result implies that the SR
protocol cannot achieve a bandwidth efficiency close to the
optimal performance, although it is the simplest to implement.
The proposed protocol, index-ARQ, provides slightly smaller
throughput compared with the Metzner protocol, but the
difference is fairly small, especially when the erasure probability
is small, such as $e < 0.05$. Even for a relatively large erasure
probability $e = 0.1$, the proposed protocol achieves 92% of
the throughput performance of the Metzner protocol.

Fig. 4. Relationship between throughput and erasure probability ($e$: $0 <
e < 0.1$, number of receivers $m = 50$, number of packets $n = 2000$).
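
As a quick check of the figures quoted above at $e = 0.1$, the following arithmetic relates each reported throughput to the ideal FEC bound. The index-ARQ throughput is not stated explicitly in the text; the value used here is inferred from the rounded 92% ratio and is therefore only approximate.

```latex
\begin{align*}
1 - e &= 1 - 0.1 = 0.9 && \text{(ideal FEC bound)} \\
r_{\mathrm{Metzner}} / (1 - e) &= 0.877 / 0.9 \approx 0.974 \\
r_{\mathrm{SR}} / (1 - e) &= 0.365 / 0.9 \approx 0.406 \\
r_{\mathrm{index}} &\approx 0.92 \times 0.877 \approx 0.807 \\
r_{\mathrm{index}} / (1 - e) &\approx 0.807 / 0.9 \approx 0.90
\end{align*}
```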

Figure 4 shows the case in which $m = 50$ and $n = 2000$.
In this case, we can also observe the same tendency seen in
Fig. 3.

In order to observe the relationship between the number of 
packets and the throughput, we conducted several experiments. 
Figure 5 shows such a relationship under the condition in
which the erasure probability is e = 0.05 and the number 
of receivers is m = 100. The horizontal axis indicates 
the number of packets, and the vertical axis indicates the 
throughput. In the case of the SR protocol, the throughputs are 
approximately constant, regardless of the number of packets. 
On the other hand, in the case of the index ARQ protocol 
and the Metzner protocol, the throughputs increase slightly as 
the number of packets increases. Moreover, the difference in 
throughput of these two protocols and the ideal FEC bound 
becomes negligible as the number of packets increases. The 
experimental results suggest a system design principle for the 
index ARQ protocol such that the number of packets should be 
larger than the number of receivers in order to achieve higher 
throughput. 

Fig. 5. Relationship between the number of packets ($n \in [10, 500]$) and
throughput ($e = 0.05$, $m = 100$).


V. Concluding summary 

In the present paper, we proposed an ARQ protocol, referred
to as index ARQ, for broadcast channels. In the proposed
protocol, the server constructs a coded packet by adding
several packets over $\mathbb{F}_q$ based on the knowledge of the
demands of all of the receivers. In order to find an appropriate
set of indices for the packets to be added, a randomized
greedy algorithm is devised. The bandwidth efficiency of the
proposed protocol derives from the fact that a coded packet
can compensate for multiple packet losses in several receivers. A
practical advantage of the proposed protocol is the simplicity
of its decoding process at the receiver side. A decoding process
consists only of several additions over $\mathbb{F}_q$ and results in a small
computational load at the receiver side. If $q = 2^m$, then only
exclusive OR operations are required to recover packet losses.

Based on the results of computer experiments, we
confirmed that the proposed protocol achieves much higher
throughputs than the SR protocol. The proposed protocol
requires a certain computational load to encode at the server side.
This computational load at the server side can be considered
as a cost to be paid in order to achieve better bandwidth
efficiency than the SR protocol. The throughput performance
of the proposed protocol is close to that of the Metzner protocol
and the ideal FEC bound when the erasure probability is in the
range of $0 < e < 0.1$, which implies that the proposed protocol
provides approximately optimal throughput performance in
such a regime. If a receiver is a mobile terminal with lower
computational power or prefers low power consumption, the
proposed protocol would be a preferable choice in order to
achieve both high bandwidth efficiency and a lower computational
load at the receiver side.